One new application of knowledge based programming
is in the area of knowledge based ship
classification via the processing of acoustic data
from a distributed sensor network. Specifically,
we are investigating the automatic generation of
the harmonic-set formation portion of the SIAP
system [Drazovich et al.-79]. In addition, we are
working on basic issues so that similarly
developed systems can be applied in other
important military mission areas, such as radar
ship classification.
1. Introduction

This paper describes the knowledge based approach
to program acquisition and synthesis and is based
upon an earlier work [Green & McCune-78]. Several
diverse potential applications are mentioned,
including synthesizing the harmonic-set formation
module of an acoustic signal understanding system
for knowledge based ship classification, creating
retrieval and analysis programs for ship
classification hypothesis structures, and making
operating systems and programming languages more
portable.

Progress in knowledge based systems is exemplified
by a description of PSI, a knowledge based
programming system that acquires high-level
descriptions of programs and produces efficient
implementations of them [Green-76]. Simple
symbolic computation programs are specified
through dialogues that include natural language,
input-output pairs, and partial traces. The
programs produced are in LISP, but experiments
have shown that the system can be extended to
produce code in a block structured language such
as PASCAL.


A knowledge based program synthesis system works
as follows. The user wants a performance program
of some type, for example, a news story indexing
program. The user must be versed in the
application area, though need not be an expert at
programming. The user conducts a dialogue, using
natural language as well as traces, examples, and
very high-level languages to describe the desired
program. From this specification the knowledge
based system synthesizes an efficient
implementation of the program. Since program
specification and use is an evolutionary process,
each successive version of the candidate target
program is tested by the user. Modifications in
specification result in successive versions of the
target program as new requirements develop or
previous requirements are clarified.

The prototype PSI system has been designed and
implemented. It performs as described above. In
particular, programs have been generated from
English dialogues for a variety of domains. Among
these programs are

• CLASS, a simple pattern classification
program which requires much of the
programming knowledge necessary for more
complex programs;

• NEWS, an information retrieval program;

• TF, a theory formation program which
"learns" (forms) an internal model of a
concept by repeatedly examining examples
of the concept; and

• Sorting programs, using efficient
techniques for specific sorting
requirements.

A new system, CHI, is being developed at Systems
Control for some of the potential applications
discussed in the following sections.


In the undersea surveillance application area the
signal understanding program is designed to
produce a description of the ocean scenario,
changing with time, that indicates the platforms
in the ocean that are generating the signals being
perceived by the sensors of the undersea
surveillance system. The ships and submarines
being detected and tracked are in a noisy ocean
environment that may also contain other ships of
no interest to the surveillance system.

Suppose an analyst is attempting to analyze
signals being received by hydrophones directed
toward a group of submarines and other platforms
in a noisy environment. The analyst must find the
location and type of each ship and associate each
frequency found with a likely source. This task
is known to be quite difficult. It far exceeds
the capabilities of any straightforward parametric
classification or pattern recognition system. It
is at the limit of what is achievable by knowledge
based signal understanding systems consisting of
large rule bases and programs that model the
sources (blades, shafts, pumps, etc.), harmonic
and ratio relations of sources, types of sources
on platforms, operational patterns, the ocean
environment, the noise sources, maximum speeds of
the platforms, whether the locations are shipping
lanes, and so forth.

If the platforms decide to change their sound,
they could disguise themselves effectively by
changing their source characteristics (e.g., by
using tone altering synthesizers, running close
together, using alternate pumps and acoustic
masking devices, running near sound reflecting
structures, and altering their operating
patterns). The situation can be further
complicated by the introduction of new types of
microphones or signal processing systems. With
these kinds of changes happening, the problem is
challenging indeed. The problem is that the
signal understanding program would have to be
preprogrammed to anticipate each possible change
in data rates, harmonic structures, amplitude and
frequency modulation, etc. Similarly, any
approach using learning would also have to
anticipate all of the types of changes that could
be expected and have to be able to search the very
large space of possible changes to find new
patterns. It is quite unlikely that it will be
possible to anticipate all such changes.

A more reasonable approach might be to allow the
signal understanding system to be reprogrammed to
respond to new patterns. The difficulty now is
reprogramming and debugging a complex system in
the short time allowed in a tactical situation.

The necessary reprogramming and debugging is of
course a slow process. A solution to the
reprogramming problem is to use a program
synthesis system to automatically reprogram or
modify existing programs and data structures to
meet the new requirements.

A scenario for the response to new signal
characteristics might begin with the signal
understanding system failing to respond, providing
information inconsistent with other observations,
or reducing certainty factors for identified
sources. We assume that the appropriate portion
of the existing signal understanding system was
itself automatically synthesized, and that the
associated explainer module could describe in
English the performance of the system to help
pinpoint the type of signal changes that are
causing the problem.

The user requests, in English, probes into the
data to look for any patterns that characterize
sources that seem to cause trouble. A user may
know from another observation the identity and
location of a particular source. If so, the user
could request a learning program to find patterns,
expressed as rules, that characterize that source.
The signal understanding system would then be
automatically reprogrammed to use the new rules
and reanalyze the signals. This interactive
process is repeated until a satisfactory
understanding of the signals is achieved.

A more difficult situation occurs when the truth
of a situation cannot be established by means of
any controlled experiments, which is frequently
the case. For example, in a noisy ocean
environment one can never positively identify all
the platforms to determine what is really
producing the signals. Since one cannot ascertain
truth, one can only judge that a given signal
analysis is satisfactory according to some set of
criteria. Then the program must still find new
rules that produce some satisfactory model of the
situation, according to criteria that an adequate
model of the situation would satisfy.

The difficulty is compounded, in that it will not
in general be possible to anticipate what criteria
or meta-rules a satisfactory model must satisfy.
For example, the analyst might suddenly notice
that the number of submarines has drastically
increased. One might add a constraint not
anticipated by the system designer that it isn't
possible for submarines to replicate themselves.
The cause for the increase in sound sources might
be that some sound generator was altered and the
harmonics produced were taken as separate sources.
It would then be appropriate to relax the
constraints on groupings of harmonics so that
previously disallowed harmonic structures would be
acceptable if they arise from the same location.
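
The relaxation described above can be illustrated with a small sketch. It assumes, purely for illustration, that each harmonic set already carries a location estimate; the function name and the (location, frequencies) representation are hypothetical and are not taken from SIAP or CHI. Under the relaxed rule, sets that arise from the same location are treated as one source even when their frequencies would not previously have been grouped.

```python
from collections import defaultdict


def merge_sets_by_location(harmonic_sets):
    """Merge harmonic sets that share a location estimate.

    harmonic_sets is a list of (location, frequencies) pairs. The
    location labels and this representation are illustrative
    assumptions, not part of the original system.
    """
    merged = defaultdict(list)
    for location, frequencies in harmonic_sets:
        # Relaxed constraint: a common location is enough to group
        # the lines, whatever their harmonic relationship.
        merged[location].extend(frequencies)
    return {loc: sorted(freqs) for loc, freqs in merged.items()}


sets = [("A", [50.0, 100.0]), ("A", [75.0]), ("B", [60.0])]
sources = merge_sets_by_location(sets)
```

Here the three apparent sources collapse to two, which is the effect the analyst wants when a sudden increase in the number of sources is implausible.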

Another situation might be that the sounds aren't
recognized because the submarines began moving at
speeds that were not anticipated. One might have
the system generate its best hypothesis that
assumes that anything moving very quickly or
erratically is not really a submarine, but instead
a decoy. One could also add constraints for a
hypothesis that best explains all possible sound
sources or is least likely to miss especially
interesting ones.

The first major task for our new system (called
CHI) in undersea surveillance is reprogramming the
harmonic-set formation module illustrated in
Figure 1. The system we develop will input the
old signal classification module, plus new signal
classification rules in a language natural for
expressing them. CHI will produce as output a
modification of the original signal classification
program which appropriately makes use of these new
rules. The classification program is, in this
case, primarily a harmonic-set formation program
that partitions the set of frequency signals into
harmonically related groups, each produced by one
source of one platform. The classification
program will use as primitive operations (1)
existing primitives of the target programming
language, (2) signal retrieval commands to a data
management system, and (3) subroutines in a simple
statistics library.
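
As a rough illustration of what harmonic-set formation means, the following sketch partitions a list of detected frequency lines into groups of near-integer multiples of a common fundamental. It is a minimal greedy procedure written for this discussion, not the SIAP algorithm; the function name, the tolerance, and the harmonic limit are all assumptions.

```python
def form_harmonic_sets(frequencies, tolerance=0.02, max_harmonic=10):
    """Greedily partition frequencies (Hz) into harmonically related groups.

    tolerance is the allowed deviation from an exact multiple, as a
    fraction of the fundamental. Both parameters are illustrative
    assumptions.
    """
    remaining = sorted(frequencies)
    groups = []
    while remaining:
        # Treat the lowest unassigned line as a candidate fundamental.
        fundamental = remaining[0]
        group, rest = [], []
        for f in remaining:
            n = round(f / fundamental)  # nearest harmonic number
            near_multiple = abs(f - n * fundamental) <= tolerance * fundamental
            if 1 <= n <= max_harmonic and near_multiple:
                group.append(f)
            else:
                rest.append(f)
        groups.append(group)
        remaining = rest
    return groups


lines = [50.0, 100.1, 149.8, 60.0, 120.0, 37.0]
harmonic_sets = form_harmonic_sets(lines)
```

With these inputs the lines at 50.0, 100.1, and 149.8 Hz fall into one set, 60.0 and 120.0 Hz into another, and 37.0 Hz stands alone. A real module would also weigh amplitude, bearing, and source models, which is where the new classification rules would enter.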

A more difficult task is the writing of a module
that learns or hypothesizes new pattern
classification rules on its own. The input for
this task is a list of constraints that all rules
must satisfy. The output is a program that
modifies old rule sets based upon new signal
information about known situations.

A knowledge based programming system would aid an
intelligence analyst by creating retrieval and
analysis programs that probe into the ship
classification knowledge base and hypothesis
structure for the current situation. Such
structures are typically very large and complex,
and the operational environment is constantly
changing. Thus, no general-purpose ship
classification system based upon prestored
knowledge will be capable of always presenting the
correct interpretation or set of feasible
interpretations. Therefore, a user such as a
tactical assessment officer should be allowed to
interact with the system to confirm its current
hypothesis and to explore additional possibilities
based upon his own past experience plus knowledge
of unique properties of the current situation.


Such exploration might include requests for a
summary based upon retrieval of a totally novel
combination of data. For example, if many vessels
aren't being identified because an important
characteristic is missing or occluded, it might
indicate that such vessels have been modified. A
series of queries to the system via the knowledge
based programming frontend might reveal important
correlations, e.g., that all such vessels are of a
particular type and all recently visited a
particular port for an extended period.

The user should also be allowed to hypothesize
certain constraints, e.g., to attempt to fill in
missing data. The system would then be rerun to
provide an analysis of the new hypothetical
situation. If the system is to be a fully
accessible assistant, the user should be able to
make a complicated examination of its chain of
reasoning to verify system credibility and make
the system less opaque.


The only way to allow such interactions to occur,
without narrowly restricting them a priori to a
small number of fixed formats, is to use a
knowledge based programming system to construct
programs which represent the required action. A
human programmer who possesses the required
expertise would be too expensive to assign to each
ship classification system at sea, and would
probably prove to be too slow at the task anyway.


4. Other Potential Applications

Another application area of interest is the
automatic generation of transportable operating
systems and programming systems (e.g., compilers).
The target "language" might be a combination of
one or more of a high-level language program,
machine language program, microcode, and very
large-scale integrated (VLSI) circuit board
design. The particular languages produced and
their combination would be tailored to each
specific computer configuration presented.

The feasibility of many other applications may not
be far off. The promise lies in our approach,
namely, that of building a large knowledge based
system that emphasizes the codification of
underlying programming principles combined with
application-specific expertise. Some generality
has already been demonstrated by extending PSI to
deal with ostensibly different kinds of programs,
using essentially the same knowledge base.




Figure 1: Signal Understanding Application
[Diagram labels: surface and undersea platforms in noisy
environments; hydrophones; signal processing; frequency
spectrum; user]


PSI is organized as a collection of interacting
modules or programmed experts, as displayed in
Figure 2. A simplified view of the data paths is
shown in Figure 3. There is one input data path
for each specification method. Currently these
are English, input-output examples, and partial
traces. A more conventional method, that of a
very high-level language, is a planned addition to
PSI, as shown in Figure 3. These specifications
are integrated into a single structure at the
program net and program model levels.


PSI's operation may be conveniently factored into
two parts (see Figure 2): the acquisition phase
(those modules shown left of the program model),
which acquires the model, and the synthesis phase,
which produces a program from the model.


In the acquisition phase, sentences are first
parsed, then interpreted and stored as a program
net structure [Ginsparg-78]. The parser is a
general parser which limits search by
incorporating considerable knowledge of English
usage. The interpreter is more specific to
program synthesis, using program description
knowledge, as well as knowledge about the question
asked and the current topic, to facilitate
interpretation into the program net.


The dialogue moderator guides the dialogue by
selecting or suppressing questions for the user.
It attempts to keep PSI and the user in agreement
on the current topic, provides a review and
preview of topics when the topic changes, helps
the user who gets lost, and allows initiative to
shift between PSI and the user.


The explainer module generates reasonably clear
English questions about and descriptions of the
program net as it is acquired, in order to help
verify that the inferred program description is
the one desired. It is also designed to explain
the how and why of the acquisition and synthesis
process to the interested user.


Another input specification method is a partial
trace [Phillips-77]. A trace includes as a
special case an example input-output pair.
Examples are useful for inferring data structures
and simple spatial transformations. Partial
traces of states of internal and I/O variables
allow the inductive inference of control
structures. The trace and example inference
expert infers a loose description of a program in
the form of a program net, rather than a program
model or other true algorithm. This technique
allows domain support to disambiguate possible
inferences and also separates the issue of
efficient implementation from the inference of the
user's intention.


Various types of programming knowledge are
distributed throughout the modules of the
acquisition phase. In contrast, knowledge
specific to one particular application domain
(e.g., knowledge about learning programs) is
concentrated in the domain expert, which supplies
domain support by communicating with other
acquisition modules through the program net.


The program net and program model (see Figure 3)
are two of the major interfaces within PSI. Both
are high-level program and data structure
description languages. The program model includes
complete, consistent, and interpretable very
high-level algorithm and information structures.
The program net, on the other hand, forms a looser
program description. Fragments of the program net
can be visited in the order of occurrence in the
dialogue, rather than in execution order, and
allow less detailed, local, and only partial
specification of the program. Since these
fragments correspond rather closely to what the
user says, they ease the burden of the
parser/interpreter, as well as the trace and
example inference module.


The program model builder [McCune-77] applies
knowledge of correct program models to convert the
fragments into a model. The model builder
processes fragments, checking for completeness and
correctness, fills in detail, corrects minor
inconsistencies, and adds cross-references. It
also generalizes the program description,
converting it into a form that allows the coder to
look for good implementations. The completed
program model may be interpreted by the model
interpreter to check that it performs as desired
by the user and also to gather information needed
by the efficiency expert, such as statistics on
set sizes and probabilities of the outcome of
tests.

After the acquisition phase is complete, the
synthesis phase begins. This phase may be viewed
as a series of refinements of the program model
into an efficient program, or as a heuristic
search in a refinement tree for an efficient
program that satisfies the program model.

The coder [Barstow-77] has a body of program
synthesis rules [Green & Barstow-75, Green &
Barstow-77] which are applied to gradually
transform the program model from abstract into
more detailed constructs until it is in the target
language. The algorithm and data structures are
refined interdependently. The coder deals
primarily with the notions of set and
correspondence operations and can synthesize
programs involving sequences, loops, simple input
and output, linked lists, arrays, and hash tables.
The refinement tree effectively forms a planning
space that proposes only legal, but possibly
inefficient, programs. This tree structure is
shared by the coder and the efficiency expert
[Kant-77]. When the coder proposes more than one
refinement or implementation, the efficiency
expert reduces the search by estimating the
time-space cost product of each proposed
refinement. The better path is followed, and
there is no backup unless the estimate later
proves to be very bad. An additional method to
reduce the size of the search space is the
factorization of the program into relatively
independent parts so that all combinations of
implementations are not considered. An analysis
for bottlenecks allows the synthesis effort to
concentrate on the more critical parts of the
program.
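
The selection step described above can be sketched as follows. This is a minimal illustration and not PSI's efficiency expert: the class and function names, the cost figures, and the part names are hypothetical. Each independent part of the program is given its own list of candidate refinements, and the candidate with the lowest estimated time-space product is chosen for each part separately, so combinations across parts are never enumerated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Refinement:
    """A candidate implementation with estimated costs (hypothetical units)."""
    name: str
    est_time: float
    est_space: float

    @property
    def cost(self):
        # Time-space cost product used to rank competing refinements.
        return self.est_time * self.est_space


def choose_refinement(candidates):
    """Follow the better path: pick the lowest estimated cost product."""
    return min(candidates, key=lambda r: r.cost)


def synthesize(parts):
    """Choose one refinement per relatively independent program part.

    parts maps a part name to its list of candidate refinements.
    """
    return {part: choose_refinement(cands).name for part, cands in parts.items()}


parts = {
    "membership": [
        Refinement("linked list", est_time=100, est_space=10),
        Refinement("hash table", est_time=5, est_space=40),
    ],
    "ordering": [
        Refinement("array", est_time=20, est_space=20),
        Refinement("linked list", est_time=60, est_space=10),
    ],
}
plan = synthesize(parts)
```

With two parts of two candidates each, the sketch makes two independent choices instead of scoring four combined programs. Backing up when an estimate later proves very bad is omitted here.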